Worst-Case Optimal Join Algorithms: Techniques, Results, and Open Problems
Abstract
Worst-case optimal join algorithms are the class of join algorithms whose runtimes match the worst-case output size of a given join query. While the first provably worst-case optimal join algorithm was discovered relatively recently, the techniques and results surrounding these algorithms grow out of decades of research from a wide range of areas, intimately connecting graph theory, algorithms, information theory, constraint satisfaction, database theory, and geometric inequalities. These ideas are not just paperware: in addition to academic project implementations, two variations of such algorithms are the workhorse join algorithms of commercial database and data analytics engines. This paper aims to be a brief introduction to the design and analysis of worst-case optimal join algorithms. We discuss the key techniques for proving runtime and output size bounds. We particularly focus on the fascinating connection between join algorithms and information-theoretic inequalities, and the idea of how one can turn a proof into an algorithm. Finally, we conclude with a representative list of fundamental open problems in this area.
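To make the idea of avoiding large intermediate results concrete, here is a minimal Python sketch of an attribute-at-a-time join for the triangle query Q(a, b, c) = R(a, b), S(b, c), T(a, c). It is an illustration in the spirit of worst-case optimal joins, not the algorithm from the paper. The function names `index` and `triangle_join` and the choice of hash-based indexes are assumptions made for this example. For three relations of size N, the triangle query can output at most N^(3/2) tuples, whereas a plan that first joins two of the relations may materialize on the order of N^2 intermediate tuples.

```python
from collections import defaultdict


def index(pairs):
    """Hypothetical helper: map each first component to the set of its partners."""
    idx = defaultdict(set)
    for x, y in pairs:
        idx[x].add(y)
    return dict(idx)  # plain dict, so lookups never create new keys


def triangle_join(R, S, T):
    """Return all (a, b, c) with (a, b) in R, (b, c) in S and (a, c) in T.

    Variables are bound one at a time (a, then b, then c); no pairwise
    intermediate join result is ever materialized.
    """
    R_ab, S_bc, T_ac = index(R), index(S), index(T)
    out = []
    # a must occur as a first component in both R and T.
    for a in R_ab.keys() & T_ac.keys():
        # b must be paired with a in R and must have at least one partner in S.
        for b in (b for b in R_ab[a] if b in S_bc):
            # c must be paired with b in S and with a in T.
            for c in S_bc[b] & T_ac[a]:
                out.append((a, b, c))
    return sorted(out)


if __name__ == "__main__":
    R = {(1, 2), (2, 3)}
    S = {(2, 3), (3, 1)}
    T = {(1, 3), (2, 1)}
    print(triangle_join(R, S, T))
```

The innermost step is a set intersection, so the work spent on each (a, b) pair is driven by the two candidate sets for c rather than by the size of a materialized two-way join. That is the intuition behind the runtime analysis, though this sketch makes no formal claim about its own complexity.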
Similar resources
Worst-Case Optimal Join at a Time
Joins are at the core of database systems, yet worst-case optimal join algorithms have been developed only recently. At the outset of this effort is the observation that the standard join plans are suboptimal as their intermediate results may be larger than the final result. To attain worst-case optimality, new join algorithms are monolithic and thus avoid intermediate results. The conceptual c...
Leapfrog Triejoin: A Simple, Worst-Case Optimal Join Algorithm
Recent years have seen exciting developments in join algorithms. In 2008, Atserias, Grohe and Marx (henceforth AGM) proved a tight bound on the maximum result size of a full conjunctive query, given constraints on the input relation sizes. In 2012, Ngo, Porat, Ré and Rudra (henceforth NPRR) devised a join algorithm with worst-case running time proportional to the AGM bound [8]. Our commercial D...
Links between Join Processing and Convex Geometry
This talk will survey some results on join processing that use inequalities from convex geometry. Recently, Ngo, Porat, Rudra, and Ré (NPRR) discovered the first relational join algorithm with worst-case optimal running time [8]. Since the seminal System R project [12], the dominant database optimizer paradigm optimizes a join query by examining each pair of joins and then combining these estim...
Triejoin: A Simple, Worst-Case Optimal Join Algorithm
Recent years have seen exciting developments in join algorithms. In 2008, Atserias, Grohe and Marx (henceforth AGM) proved a tight bound on the maximum result size of a full conjunctive query, given constraints on the input relation sizes. In 2012, Ngo, Porat, Ré and Rudra (henceforth NPRR) devised a join algorithm with worst-case running time proportional to the AGM bound [8]. Our commercial d...
Leapfrog Triejoin: a worst-case optimal join algorithm
Recent years have seen exciting developments in join algorithms. In 2008, Atserias, Grohe and Marx (henceforth AGM) proved a tight bound on the maximum result size of a full conjunctive query, given constraints on the input relation sizes. In 2012, Ngo, Porat, Ré and Rudra (henceforth NPRR) devised a join algorithm with worst-case running time proportional to the AGM bound [8]. Our commercial ...
Publication date: 2018